Information Retrieval from an Incomplete Data Cube

Author

  • Curtis Dyreson
Abstract

A complete data cube is a data cube in which every aggregate value in the multidimensional space is stored or can be computed. An incomplete data cube is a data cube in which points in the multidimensional space are missing and cannot be computed. This paper describes an incomplete data cube design. An incomplete data cube is modeled as a federation of cubettes. A cubette is a complete sub-cube within the incomplete data cube. The incomplete cube is built piecemeal by giving a concise, high-level specification of each cubette. An efficient algorithm to retrieve an aggregate value from the incomplete data cube is described. When a value cannot be retrieved because it is missing, alternatives at a lower precision that can be retrieved are identified. When a value can be partially computed (i.e., some of the values lower in the hierarchy are missing, but some are present), a measure of the completeness of the result is supplied along with the partially aggregated value. The design also includes an algorithm that removes redundant cubettes and an algorithm to increase the retrieval power of the federation through the creation of virtual cubettes.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.
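The retrieval behaviour summarized in the abstract can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not spell out. It is a minimal toy model under stated assumptions: a single dimension with a hypothetical city/state hierarchy (`PARENT`), SUM as the only aggregate, and a cubette represented as a plain dictionary of stored cells. The names `Cubette`, `retrieve`, and `lower_precision_alternatives` are invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical one-dimensional hierarchy: child -> parent ("ALL" is the root).
PARENT = {
    "Tucson": "Arizona", "Phoenix": "Arizona",
    "Austin": "Texas", "Dallas": "Texas",
    "Arizona": "ALL", "Texas": "ALL",
}


def children(node):
    """Cells directly below `node` in the hierarchy."""
    return [child for child, parent in PARENT.items() if parent == node]


def leaves(node):
    """Finest-grained cells at or below `node`."""
    kids = children(node)
    if not kids:
        return [node]
    return [leaf for kid in kids for leaf in leaves(kid)]


@dataclass
class Cubette:
    """Assumed representation: stored SUM values, keyed by cell name."""
    cells: dict


def lookup(federation, cell):
    """Return the stored value of `cell` from the first cubette holding it."""
    for cubette in federation:
        if cell in cubette.cells:
            return cubette.cells[cell]
    return None


def retrieve(federation, cell):
    """Return (value, completeness) for `cell`.

    completeness is the fraction of leaf cells under `cell` that contributed
    to the value; (None, 0.0) means nothing could be retrieved.
    """
    stored = lookup(federation, cell)
    if stored is not None:
        return stored, 1.0
    total, covered = 0, 0.0
    for kid in children(cell):
        value, completeness = retrieve(federation, kid)
        if value is not None:
            total += value
            covered += completeness * len(leaves(kid))
    if covered == 0:
        return None, 0.0
    return total, covered / len(leaves(cell))


def lower_precision_alternatives(federation, cell):
    """Coarser ancestors of `cell` whose value can be retrieved completely."""
    found = []
    node = PARENT.get(cell)
    while node is not None:
        value, completeness = retrieve(federation, node)
        if completeness == 1.0:
            found.append((node, value))
        node = PARENT.get(node)
    return found


# A federation with a gap: Phoenix is missing and Texas is stored only as a total.
partial = [Cubette({"Tucson": 10}), Cubette({"Texas": 50})]

print(retrieve(partial, "Arizona"))                     # (10, 0.5)
print(retrieve(partial, "ALL"))                         # (60, 0.75)
print(retrieve(partial, "Phoenix"))                     # (None, 0.0)
print(lower_precision_alternatives(partial, "Austin"))  # [('Texas', 50)]
```

In this toy federation the Arizona total is only half complete because Phoenix is missing, and a query for Austin falls back to the coarser Texas cell. Redundant-cubette removal and virtual cubettes, also mentioned in the abstract, are not modeled here.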

Similar resources

Ensemble-based Top-k Recommender System Considering Incomplete Data

Recommender systems have been widely used in e-commerce applications. They are a subclass of information filtering systems, used either to predict whether a user will prefer an item (the prediction problem) or to identify a set of k items that will interest the user (the Top-k recommendation problem). Demanding sufficient ratings to make robust predictions and suggesting qualified recommendations are two si...

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function, and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...

Information Retrieval from an Incomplete Data Cube

A complete data cube is a data cube in which every aggregate value in the multidimensional space is stored or can be computed. An incomplete data cube is a data cube in which points in the multidimensional space are missing and cannot be computed. This paper describes an incomplete data cube design. An incomplete data cube is modeled as a federation of cubettes. A cubette is a complete subcube...

An Effective Path-aware Approach for Keyword Search over Data Graphs

Keyword search is known as a user-friendly alternative to structured languages for retrieving information from graph-structured data. Efficient retrieval of relevant answers to a keyword query and effective ranking of these answers according to their relevance are the two main challenges in keyword search over graph-structured data. In this paper, a novel scoring function is proposed, w...

Publication date: 1996